Two-pass Algorithm for Large Vocabulary Continuous Speech Recognition

Author

  • Valeriy Pylypenko
Abstract

This paper presents a two-pass algorithm for Extra Large (more than 1M words) Vocabulary COntinuous Speech recognition based on Information Retrieval (ELVIRCOS). The approach decomposes the recognition process into two passes: the first pass uses an information retrieval procedure to build the word subset on which the second pass performs recognition. Word graph composition for continuous speech is also presented. With this approach, high performance can be obtained for large vocabulary speech recognition.
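The first-pass idea, narrowing a very large vocabulary to a small candidate subset by retrieval, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the use of phoneme trigrams as index terms, the shared-trigram count as the ranking score, the toy lexicon and its pronunciations, and the function names `build_index` and `first_pass_subset`. The second pass (recognition restricted to the returned subset) is not shown.

```python
from collections import Counter, defaultdict


def ngrams(phones, n=3):
    """Return all contiguous n-grams of a phoneme sequence as tuples."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def build_index(lexicon, n=3):
    """Inverted index: phoneme n-gram -> set of words whose pronunciation contains it."""
    index = defaultdict(set)
    for word, phones in lexicon.items():
        for gram in ngrams(phones, n):
            index[gram].add(word)
    return index


def first_pass_subset(decoded_phones, index, top_k=2, n=3):
    """Rank words by the number of n-grams shared with the decoded phoneme string."""
    scores = Counter()
    for gram in ngrams(decoded_phones, n):
        for word in index.get(gram, ()):
            scores[word] += 1
    return [word for word, _ in scores.most_common(top_k)]


# Hypothetical toy lexicon; a real system would index more than 1M words.
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "peach": ["p", "iy", "ch"],
    "graph": ["g", "r", "ae", "f"],
}

index = build_index(LEXICON)
# Assumed output of a first-pass phoneme recognizer for one speech segment.
subset = first_pass_subset(["s", "p", "iy", "ch"], index, top_k=2)
```

In this sketch the returned `subset` would serve as the reduced vocabulary for a second-pass recognizer, so the expensive word-level search never touches words that share no index terms with the first-pass output.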


Similar resources

Extra large vocabulary continuous speech recognition algorithm based on information retrieval

This paper presents a new two-pass algorithm for Extra Large (more than 1M words) Vocabulary COntinuous Speech recognition based on Information Retrieval (ELVIRCOS). The approach decomposes the recognition process into two passes: the first pass uses an information retrieval procedure to build the word subset on which the second pass performs recognition. Word graph composition ...


Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieving from this archive, is one of the key issues for this broadcasting...


First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs

We present a method to perform first-pass large vocabulary continuous speech recognition using only a neural network and language model. Deep neural network acoustic models are now commonplace in HMM-based speech recognition systems, but building such systems is a complex, domain-specific task. Recent work demonstrated the feasibility of discarding the HMM sequence modeling framework by directl...


Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition

This paper proposes a new on-the-fly composition algorithm for Weighted Finite-State Transducers (WFSTs) in large-vocabulary continuous-speech recognition. In general on-the-fly composition, two transducers are composed during decoding, and a Viterbi search is performed based on the composed search space. In this new method, a Viterbi search is performed based on the first of two transducers. T...


New developments in the INRS continuous speech recognition system

New techniques are developed for the second-pass search in our large vocabulary continuous speech recognition system. The merging of recognition hypotheses is proposed in order to linearize the exponential growth of the tree structure in the depth-first search. Branching ordering of the first-pass word graph and pruning at both word and phone levels are used to further speed up the search. The algo...



Publication date: 2007